This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images. The system substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
translated by 谷歌翻译
The automated segmentation and tracking of macrophages during their migration are challenging tasks due to their dynamically changing shapes and motions. This paper proposes a new algorithm to achieve automatic cell tracking in time-lapse microscopy macrophage data. First, we design a segmentation method employing space-time filtering, local Otsu's thresholding, and the SUBSURF (subjective surface segmentation) method. Next, the partial trajectories for cells overlapping in the temporal direction are extracted in the segmented images. Finally, the extracted trajectories are linked by considering their direction of movement. The segmented images and the obtained trajectories from the proposed method are compared with those of the semi-automatic segmentation and manual tracking. The proposed tracking achieved 97.4% of accuracy for macrophage data under challenging situations, feeble fluorescent intensity, irregular shapes, and motion of macrophages. We expect that the automatically extracted trajectories of macrophages can provide pieces of evidence of how macrophages migrate depending on their polarization modes in the situation, such as during wound healing.
translated by 谷歌翻译
The ability to convert reciprocating, i.e., alternating, actuation into rotary motion using linkages is hindered fundamentally by their poor torque transmission capability around kinematic singularity configurations. Here, we harness the elastic potential energy of a linear spring attached to the coupler link of four-bar mechanisms to manipulate force transmission around the kinematic singularities. We developed a theoretical model to explore the parameter space for proper force transmission in slider-crank and rocker-crank four-bar kinematics. Finally, we verified the proposed model and methodology by building and testing a macro-scale prototype of a slider-crank mechanism. We expect this approach to enable the development of small-scale rotary engines and robotic devices with closed kinematic chains dealing with serial kinematic singularities, such as linkages and parallel manipulators.
translated by 谷歌翻译
Neural network (NN) potentials promise highly accurate molecular dynamics (MD) simulations within the computational complexity of classical MD force fields. However, when applied outside their training domain, NN potential predictions can be inaccurate, increasing the need for Uncertainty Quantification (UQ). Bayesian modeling provides the mathematical framework for UQ, but classical Bayesian methods based on Markov chain Monte Carlo (MCMC) are computationally intractable for NN potentials. By training graph NN potentials for coarse-grained systems of liquid water and alanine dipeptide, we demonstrate here that scalable Bayesian UQ via stochastic gradient MCMC (SG-MCMC) yields reliable uncertainty estimates for MD observables. We show that cold posteriors can reduce the required training data size and that for reliable UQ, multiple Markov chains are needed. Additionally, we find that SG-MCMC and the Deep Ensemble method achieve comparable results, despite shorter training and less hyperparameter tuning of the latter. We show that both methods can capture aleatoric and epistemic uncertainty reliably, but not systematic uncertainty, which needs to be minimized by adequate modeling to obtain accurate credible intervals for MD observables. Our results represent a step towards accurate UQ that is of vital importance for trustworthy NN potential-based MD simulations required for decision-making in practice.
translated by 谷歌翻译
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io
translated by 谷歌翻译
Fake news detection has become a research area that goes way beyond a purely academic interest as it has direct implications on our society as a whole. Recent advances have primarily focused on textbased approaches. However, it has become clear that to be effective one needs to incorporate additional, contextual information such as spreading behaviour of news articles and user interaction patterns on social media. We propose to construct heterogeneous social context graphs around news articles and reformulate the problem as a graph classification task. Exploring the incorporation of different types of information (to get an idea as to what level of social context is most effective) and using different graph neural network architectures indicates that this approach is highly effective with robust results on a common benchmark dataset.
translated by 谷歌翻译
Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretically well-motivated and well-understood features already exist. Working with such features can bring many benefits, including greater interpretability, reduced training and run time, and enhanced stability and robustness. We develop a new feature selection method based on Distance Correlation (DisCo), and demonstrate its effectiveness on the tasks of boosted top- and $W$-tagging. Using our method to select features from a set of over 7,000 energy flow polynomials, we show that we can match the performance of much deeper architectures, by using only ten features and two orders-of-magnitude fewer model parameters.
translated by 谷歌翻译
深度学习算法的最新进展为解决许多医学图像分析问题带来了重大好处。培训深度学习模型通常需要具有专家标记注释的大型数据集。但是,获取专家标记的注释不仅昂贵,而且主观,容易出错,并且观察者内部变异性会引入标签。由于解剖学的模棱两可,使用深度学习模型来细分医学图像时,这尤其是一个问题。基于图像的医学诊断工具使用经过不正确分段标签训练的深度学习模型可以导致错误的诊断和治疗建议。与单评论注释相比,多评价者注释可能更适合于使用小型培训集的深度学习模型进行训练。本文的目的是开发和评估一种基于MRI中病变特征的多评价者注释和解剖学知识来生成概率标签的方法,以及一种使用概率的标签使用归一化活动性损失作为A的病变特征的解剖学知识,以训练分割模型”。耐噪声损失的功能。通过将17个膝盖MRI扫描的二进制基础真理进行比较,以评估该模型,以用于临床分割和检测骨髓病变(BML)。该方法与二进制跨透镜损失函数相比,该方法成功提高了精度14,召回22和骰子得分8%。总体而言,这项工作的结果表明,使用软标签的拟议归一化主动损失成功地减轻了嘈杂标签的影响。
translated by 谷歌翻译
已经证明,经过代码完成培训的大型语言模型(LLMS)能够合成DocStrings的简单Python程序[1]。我们发现这些代码编写的LLM可以被重新使用以编写机器人策略代码,给定自然语言命令。具体而言,策略代码可以表达处理感知输出的功能或反馈循环(例如,从对象检测器[2],[3])并参数化控制原始API。当作为输入提供了几个示例命令(格式为注释)后,然后是相应的策略代码(通过少量提示),LLMS可以接收新命令并自主重新编写API调用以分别生成新的策略代码。通过链接经典的逻辑结构并引用第三方库(例如,numpy,shapely)执行算术,以这种方式使用的LLM可以编写(i)(i)表现出空间几何推理的机器人策略,(ii)(ii)将其推广到新的说明和新指令和新指令和(iii)根据上下文(即行为常识)规定模棱两可的描述(例如“更快”)的精确值(例如,速度)。本文将代码作为策略介绍:语言模型生成程序的以机器人为中心的形式化(LMP),该程序可以代表反应性策略(例如阻抗控制器),以及基于Waypoint的策略(基于远见的选择,基于轨迹,基于轨迹,控制),在多个真实的机器人平台上展示。我们方法的核心是促使层次代码 - 代码(递归定义未定义的功能),该代码可以编写更复杂的代码,还可以改善最新的代码,以解决HOMANEVAL [1]基准中的39.8%的问题。代码和视频可从https://code-as-policies.github.io获得。
translated by 谷歌翻译
机器人的共同适应一直是一项长期的研究努力,其目的是将系统的身体和行为适应给定的任务,灵感来自动物的自然演变。共同适应有可能消除昂贵的手动硬件工程,并提高系统性能。共同适应的标准方法是使用奖励功能来优化行为和形态。但是,众所周知,定义和构建这种奖励功能是困难的,并且通常是一项重大的工程工作。本文介绍了关于共同适应问题的新观点,我们称之为共同构图:寻找形态和政策,使模仿者可以紧密匹配演示者的行为。为此,我们提出了一种通过匹配示威者的状态分布来适应行为和形态的共同模拟方法。具体而言,我们专注于两种代理之间的状态和动作空间不匹配的挑战性情况。我们发现,共同映射会增加各种任务和设置的行为相似性,并通过将人的步行,慢跑和踢到模拟的人形生物转移来证明共同映射。
translated by 谷歌翻译